Results 1 - 20 of 449,456
1.
Arch Virol ; 169(5): 110, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664287

ABSTRACT

Advancements in high-throughput sequencing and the development of new bioinformatics tools for large-scale data analysis play a crucial role in uncovering virus diversity and enhancing our understanding of virus evolution. The discovery of the ormycovirus clades, a group of RNA viruses that are phylogenetically distinct from all known Riboviria members and are found in fungi, highlights the value of these tools for the discovery of novel viruses. The aim of this study was to examine viral populations in fungal hosts to gain insights into the diversity, evolution, and classification of these viruses. Here, we report the molecular characterization of a newly discovered ormycovirus, which we have named "Hortiboletus rubellus ormycovirus 1" (HrOMV1), that was found in the ectomycorrhizal fungus Hortiboletus rubellus. The bipartite genome of HrOMV1, whose nucleotide sequence was determined by HTS and RLM-RACE, consists of two RNA segments (RNA1 and RNA2) that exhibit similarity to those of previously studied ormycoviruses in their organization and the proteins they encode. The presence of upstream, in-frame AUG triplets in the 5' termini of both RNA segments suggests that HrOMV1, like certain other ormycoviruses, employs a non-canonical translation initiation strategy. Phylogenetic analysis showed that HrOMV1 is positioned within the gammaormycovirus clade. Its putative RNA-dependent RNA polymerase (RdRp) exhibits sequence similarity to those of other gammaormycovirus members, with the greatest similarity to that of Termitomyces ormycovirus 1 (33.05% sequence identity). This protein was found to contain conserved motifs that are crucial for RNA replication, including the distinctive GDQ catalytic triad observed in gammaormycovirus RdRps. The results of this study underscore the significance of investigating the ecological role of mycoviruses in mycorrhizal fungi. This is the first report of an ormycovirus infecting a member of the ectomycorrhizal genus Hortiboletus.
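
The observation about upstream, in-frame AUG triplets can be illustrated with a minimal sketch. It is not the authors' method: the function name `upstream_inframe_augs` and the toy sequences are hypothetical, and the sketch only reports AUG triplets that share a reading frame with an annotated start position, without checking for intervening stop codons.

```python
def upstream_inframe_augs(rna, start):
    """Return 0-based positions of AUG triplets located upstream of `start`
    and in the same reading frame as the codon beginning at `start`.

    `rna` may be given as RNA (U) or as cDNA (T); `start` is the 0-based
    index of the annotated start codon (an assumption supplied by the caller).
    """
    rna = rna.upper().replace("T", "U")
    # Walk the frame of the annotated start codon from the 5' end.
    return [i for i in range(start % 3, start, 3) if rna[i:i + 3] == "AUG"]


# Toy example: annotated start at index 9, with in-frame AUGs at 0 and 6.
in_frame_hits = upstream_inframe_augs("AUGCCCAUGAUGGCC", 9)

# Toy example: the only upstream AUG (index 1) is out of frame.
out_of_frame_hits = upstream_inframe_augs("CAUGCCCCCAUGG", 9)
```

A non-empty result only flags candidates; deciding whether translation actually initiates there requires experimental evidence.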


Subjects
Viral Genome, Mycorrhizae, Phylogeny, RNA Viruses, RNA Viruses/genetics, RNA Viruses/classification, RNA Viruses/isolation & purification, Mycorrhizae/genetics, Mycorrhizae/virology, Fungal Viruses/genetics, Fungal Viruses/classification, Fungal Viruses/isolation & purification, Viral RNA/genetics, High-Throughput Nucleotide Sequencing, Viral Proteins/genetics, Open Reading Frames, Base Sequence
2.
Cell Rep ; 43(4): 114082, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38583155

ABSTRACT

Infections caused by methicillin-resistant Staphylococcus aureus (MRSA) are alarmingly common, and treatment is confined to last-line antibiotics. Vancomycin is the treatment of choice for MRSA bacteremia, and treatment failure is often associated with vancomycin-intermediate S. aureus isolates. The regulatory 3' UTR of the vigR mRNA contributes to vancomycin tolerance and upregulates the autolysin IsaA. Using MS2-affinity purification coupled with RNA sequencing, we find that the vigR 3' UTR also regulates dapE, a succinyl-diaminopimelate desuccinylase required for lysine and peptidoglycan synthesis, suggesting a broader role in controlling cell wall metabolism and vancomycin tolerance. Deletion of the 3' UTR increased virulence, while the isaA mutant is completely attenuated in a wax moth larvae model. Sequence and structural analyses of vigR indicated that the 3' UTR has expanded through the acquisition of Staphylococcus aureus repeat insertions that contribute sequence for the isaA interaction seed and may functionalize the 3' UTR.


Subjects
3' Untranslated Regions, Virulence/genetics, 3' Untranslated Regions/genetics, Staphylococcus aureus/genetics, Staphylococcus aureus/pathogenicity, Staphylococcus aureus/drug effects, Animals, Staphylococcal Infections/microbiology, Staphylococcal Infections/drug therapy, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Methicillin-Resistant Staphylococcus aureus/genetics, Methicillin-Resistant Staphylococcus aureus/pathogenicity, Methicillin-Resistant Staphylococcus aureus/drug effects, Bacterial Gene Expression Regulation, Moths/microbiology, Vancomycin/pharmacology, Anti-Bacterial Agents/pharmacology, Base Sequence
3.
Nat Commun ; 15(1): 3323, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637518

ABSTRACT

Direct RNA sequencing offers the possibility to simultaneously identify canonical bases and epi-transcriptomic modifications in each single RNA molecule. Thus far, the development of computational methods has been hampered by the lack of biologically realistic training data that carries modification labels at molecular resolution. Here, we report on the synthesis of such samples and the development of a bespoke algorithm, mAFiA (m6A Finding Algorithm), that accurately detects single m6A nucleotides in both synthetic RNAs and natural mRNA at the single-read level. Our approach uncovers distinct modification patterns in single molecules that would appear identical at the ensemble level. Compared to existing methods, mAFiA also demonstrates improved accuracy in measuring site-level m6A stoichiometry in biological samples.


Subjects
Nucleotides, RNA, RNA/genetics, Messenger RNA/genetics, Base Sequence, RNA Sequence Analysis/methods
4.
Sci Rep ; 14(1): 9000, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637641

ABSTRACT

Long-read genome sequencing (lrGS) is a promising method in genetic diagnostics. Here we investigate the potential of lrGS to detect a disease-associated chromosomal translocation between 17p13 and the 19 centromere. We constructed two sets of phased and non-phased de novo assemblies: (i) based on lrGS only and (ii) hybrid assemblies combining lrGS with optical mapping, using lrGS reads with a median coverage of 34X. Variant calling detected both structural variants (SVs) and small variants, and the accuracy of the small variant calling was compared with that of short-read genome sequencing (srGS). The de novo and hybrid assemblies had high quality and contiguity, with an N50 of 62.85 Mb, enabling a near telomere-to-telomere assembly with fewer than 100 contigs per haplotype. Notably, we successfully identified the centromeric breakpoint of the translocation. A concordance of 92% was observed when comparing small variant calling between srGS and lrGS. In summary, our findings underscore the remarkable potential of lrGS as a comprehensive and accurate solution for the analysis of SVs and small variants. Thus, lrGS could replace a large battery of genetic tests that were used for the diagnosis of a single symptomatic translocation carrier, highlighting the potential of lrGS in the realm of digital karyotyping.
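
For readers unfamiliar with the N50 contiguity statistic quoted above, the following is a minimal sketch of the standard definition: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half the assembly size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly of six contigs (arbitrary units): total 290, half 145.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is commonly reported alongside contig counts.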


Subjects
High-Throughput Nucleotide Sequencing, Genetic Translocation, Humans, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Base Sequence, Centromere/genetics
5.
Biotechniques ; 76(5): 203-215, 2024 May.
Article in English | MEDLINE | ID: mdl-38573592

ABSTRACT

In the absence of a DNA template, the ab initio production of long double-stranded DNA molecules of predefined sequences is particularly challenging. The DNA synthesis step remains a bottleneck for many applications such as functional assessment of ancestral genes, analysis of alternative splicing or DNA-based data storage. In this report we propose a fully in vitro protocol to generate very long double-stranded DNA molecules starting from commercially available short DNA blocks in less than 3 days using Golden Gate assembly. This innovative application allowed us to streamline the process to produce a 24 kb-long DNA molecule storing part of the Declaration of the Rights of Man and of the Citizen of 1789. The DNA molecule produced can be readily cloned into a suitable host/vector system for amplification and selection.


Subjects
DNA, DNA/genetics, DNA/chemistry, Information Storage and Retrieval/methods, Humans, Base Sequence/genetics, Molecular Cloning/methods
6.
Bioorg Med Chem ; 104: 117700, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38583236

ABSTRACT

Adenosine Deaminases Acting on RNA (ADARs) catalyze the deamination of adenosine to inosine in double-stranded RNA (dsRNA). ADARs' ability to recognize and edit dsRNA is dependent on local sequence context surrounding the edited adenosine and the length of the duplex. A deeper understanding of how editing efficiency is affected by mismatches, loops, and bulges around the editing site would aid in the development of therapeutic gRNAs for ADAR-mediated site-directed RNA editing (SDRE). Here, a SELEX (systematic evolution of ligands by exponential enrichment) approach was employed to identify dsRNA substrates that bind to the deaminase domain of human ADAR2 (hADAR2d) with high affinity. A library of single-stranded RNAs was hybridized with a fixed-sequence target strand containing the nucleoside analog 8-azanebularine that mimics the adenosine deamination transition state. The presence of this nucleoside analog in the library biased the screen to identify hit sequences compatible with adenosine deamination at the site of 8-azanebularine modification. SELEX also identified non-duplex structural elements that supported editing at the target site while inhibiting editing at bystander sites.


Subjects
Adenosine Deaminase, Purine Nucleosides, Ribonucleosides, Humans, Adenosine, Adenosine Deaminase/metabolism, Base Sequence, Double-Stranded RNA, CRISPR-Cas Systems Guide RNA
7.
Sci Rep ; 14(1): 7636, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38561351

ABSTRACT

Abies koreana E.H.Wilson is an endangered evergreen coniferous tree that is native to high altitudes in South Korea and susceptible to the effects of climate change. Hybridization and reticulate evolution have been reported in the genus; therefore, multigene datasets from nuclear and cytoplasmic genomes are needed to better understand its evolutionary history. Using the Illumina NovaSeq 6000 and Oxford Nanopore Technologies (ONT) PromethION platforms, we generated complete mitochondrial (1,174,803 bp) and plastid (121,341 bp) genomes from A. koreana. The mitochondrial genome is highly dynamic, transitioning from cis- to trans-splicing and breaking conserved gene clusters. In the plastome, the ONT reads revealed two structural conformations of A. koreana. The short inverted repeats (1186 bp) of the A. koreana plastome are associated with different structural types. Transcriptomic sequencing revealed 1356 sites of C-to-U RNA editing in the 41 mitochondrial genes. Using A. koreana as a reference, we additionally produced nuclear and organelle genomic sequences from eight Abies species and generated multiple datasets for maximum likelihood and network analyses. Three sections (Balsamea, Momi, and Pseudopicea) were well grouped in the nuclear phylogeny, but the phylogenomic relationships showed conflicting signals in the mitochondrial and plastid genomes, indicating a complicated evolutionary history that may have included introgressive hybridization. The obtained data illustrate that phylogenomic analyses based on sequences from differently inherited organelle genomes have resulted in conflicting trees. Organelle capture, organelle genome recombination, and incomplete lineage sorting in an ancestral heteroplasmic individual can contribute to phylogenomic discordance. We provide strong support for the relationships within Abies and new insights into the phylogenomic complexity of this genus.


Subjects
Abies, Phylogeny, Abies/genetics, Base Sequence, Cycadopsida/genetics, Plastids/genetics
8.
Sci Rep ; 14(1): 7648, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38561388

ABSTRACT

Natural killer (NK) cells play essential roles in tumor development, diagnosis, and prognosis. In this study, we aimed to establish a reliable signature based on marker genes in NK cells, thus providing a new perspective for assessing immunotherapy and the prognosis of patients with gastric cancer (GC). We analyzed a total of 1560 samples retrieved from the public database. We performed a comprehensive analysis of single-cell RNA-sequencing (scRNA-seq) data of gastric cancer and identified 377 marker genes for NK cells. By performing Cox regression analysis, we established a 12-gene NK cell-associated signature (NKCAS) for the Cancer Genome Atlas (TCGA) cohort, which assigned GC patients to a low-risk group (LRG) or a high-risk group (HRG). In the TCGA cohort, the area under the curve (AUC) values were 0.73, 0.81, and 0.80 at 1, 3, and 5 years. The predictive ability of the signature was then externally validated in the Gene Expression Omnibus (GEO) cohort (GSE84437). The expression levels of signature genes were measured and validated in GC cell lines by real-time PCR. Moreover, NKCAS was identified as an independent prognostic factor by multivariate analysis. We combined this with a variety of clinicopathological characteristics (age, M stage, and tumor grade) to construct a nomogram to predict the survival outcomes of patients. Moreover, the LRG showed higher immune cell infiltration, especially CD8+ T cells and NK cells. The risk score was negatively associated with inflammatory activities. Importantly, analysis of the independent immunotherapy cohort showed that the LRG had a better prognosis and immunotherapy response when compared with the HRG. The identification of NK cell marker genes in this study suggests potential therapeutic targets. Additionally, the developed predictive signatures and nomograms may aid in the clinical management of GC.


Subjects
Stomach Neoplasms, Humans, Stomach Neoplasms/genetics, Stomach Neoplasms/therapy, Prognosis, Base Sequence, Immunotherapy, RNA, Tumor Microenvironment
9.
Cell Biochem Funct ; 42(3): e3994, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38566355

ABSTRACT

This study aimed to investigate the expression pattern and mechanisms of Pyruvate Dehydrogenase Phosphatase Catalytic Subunit 1 (PDP1) in the progression of breast cancer (BC). PDP1, known for its involvement in cell energy metabolism, was found to be overexpressed in BC tissues, and low PDP1 expression was associated with improved overall survival (OS) in BC patients. PDP1 knockdown suppressed cell proliferation and migration and triggered cell apoptosis in BC cells. In vivo assessments through a xenograft model unveiled the pivotal role and underlying mechanisms of PDP1 knockdown. RNA sequencing and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of RNAs from PDP1 knockdown and normal MCF7 cells revealed 1440 differentially expressed genes, spotlighting the involvement of the JAK/STAT3 signaling pathway in BC progression. Western blot results implied that PDP1 knockdown led to a loss of p-STAT3, whereas overexpression of PDP1 induced p-STAT3 expression. Cell counting kit-8 assay showed that PDP1 overexpression significantly raised MDA-MB-231 and MCF7 cell viability, while the STAT3 inhibitor S3I-201 restored cell growth to normal levels. To summarize, PDP1 promotes the progression of BC through the STAT3 pathway by regulating p-STAT3. The findings contribute to understanding the molecular mechanisms underlying BC progression and open avenues for targeted therapeutic approaches.


Subjects
Breast Neoplasms, Female, Humans, Base Sequence, Breast Neoplasms/genetics, Breast Neoplasms/metabolism, Tumor Cell Line, Cell Proliferation, MCF-7 Cells, Signal Transduction, STAT3 Transcription Factor/genetics, STAT3 Transcription Factor/metabolism
11.
Arthritis Res Ther ; 26(1): 78, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38570801

ABSTRACT

BACKGROUND: Transitioning from a genetic association signal to an effector gene and a targetable molecular mechanism requires the application of functional fine-mapping tools such as reporter assays and genome editing. In this report, we undertook such studies on the osteoarthritis (OA) risk that is marked by single nucleotide polymorphism (SNP) rs34195470 (A > G). The OA risk-conferring G allele of this SNP associates with increased DNA methylation (DNAm) at two CpG dinucleotides within WWP2. This gene encodes a ubiquitin ligase and is the host gene of microRNA-140 (miR-140). WWP2 and miR-140 are both regulators of TGFβ signaling. METHODS: Nucleic acids were extracted from adult OA (arthroplasty) and foetal cartilage. Samples were genotyped and DNAm quantified by pyrosequencing at the two CpGs plus 14 flanking CpGs. CpGs were tested for transcriptional regulatory effects using a chondrocyte cell line and reporter gene assay. DNAm was altered using epigenetic editing, with the impact on gene expression determined using RT-qPCR. In silico analysis complemented laboratory experiments. RESULTS: rs34195470 genotype associates with differential methylation at 14 of the 16 CpGs in OA cartilage, forming a methylation quantitative trait locus (mQTL). The mQTL is less pronounced in foetal cartilage (5/16 CpGs). The reporter assay revealed that the CpGs reside within a transcriptional regulator. Epigenetic editing to increase their DNAm resulted in altered expression of the full-length and N-terminal transcript isoforms of WWP2. No changes in expression were observed for the C-terminal isoform of WWP2 or for miR-140. CONCLUSIONS: As far as we are aware, this is the first experimental demonstration of an OA association signal targeting specific transcript isoforms of a gene. The WWP2 isoforms encode proteins with varying substrate specificities for the components of the TGFβ signaling pathway. Future analysis should focus on the substrates regulated by the two WWP2 isoforms that are the targets of this genetic risk.


Subjects
MicroRNAs, Osteoarthritis, Adult, Humans, Base Sequence, Ubiquitin/genetics, Ubiquitin/metabolism, Protein Isoforms/metabolism, Ubiquitin-Protein Ligases/genetics, Ubiquitin-Protein Ligases/metabolism, DNA Methylation/genetics, MicroRNAs/metabolism, Osteoarthritis/genetics, Osteoarthritis/metabolism, Transforming Growth Factor beta/metabolism
12.
HLA ; 103(4): e15468, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38575356

ABSTRACT

HLA-DQB1*02:01:01:21Q differs from HLA-DQB1*02:01:01:01 by one nucleotide substitution in the splice site at the beginning of intron 3.


Subjects
Base Sequence, Humans, Alleles, HLA-DQ beta-Chains/genetics, Introns
13.
HLA ; 103(4): e15409, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38575362

ABSTRACT

The novel allele HLA-DPB1*1467:01 differs from HLA-DPB1*09:01:01:01 by one non-synonymous nucleotide substitution in exon 2.


Subjects
Base Sequence, Humans, Alleles, HLA-DP beta-Chains/genetics, Exons/genetics, DNA Sequence Analysis
14.
HLA ; 103(4): e15473, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38575364
15.
HLA ; 103(4): e15500, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38661074

ABSTRACT

Full-length sequence characterisation of the novel HLA-DQA1*05:107 allele from whole genome sequencing data.


Subjects
Alleles, HLA-DQ alpha-Chains, Humans, HLA-DQ alpha-Chains/genetics, Exons, Histocompatibility Testing, Base Sequence, Whole Genome Sequencing
16.
Sci Rep ; 14(1): 8258, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38589409

ABSTRACT

Major depressive disorder (MDD) is a complex and potentially debilitating illness whose etiology and pathology remain unclear. Non-coding RNAs have been implicated in MDD, where they display differential expression in the brain and the periphery. In this study, we quantified small nucleolar RNA (snoRNA) expression by small RNA sequencing in the lateral habenula (LHb) of individuals with MDD (n = 15) and psychiatrically healthy controls (n = 15). We uncovered five snoRNAs that exhibited differential expression between MDD and controls (FDR < 0.01). Specifically, SNORA69 showed increased expression in MDD and was technically validated via RT-qPCR. We further investigated the expression of Snora69 in the LHb and peripheral blood of an unpredictable chronic mild stress (UCMS) mouse model of depression. Snora69 was specifically up-regulated in mice that underwent the UCMS paradigm. SNORA69 is known to guide pseudouridylation onto 5.8S and 18S rRNAs. We quantified the relative abundance of pseudouridines on 5.8S and 18S rRNA in human post-mortem LHb samples and found increased abundance of pseudouridines in the MDD group. Overall, our findings indicate the importance of brain snoRNAs in the pathology of MDD. Future studies characterizing SNORA69's role in MDD pathology are warranted.


Subjects
Major Depressive Disorder, Habenula, Humans, Animals, Mice, Major Depressive Disorder/genetics, Habenula/metabolism, Base Sequence, 18S Ribosomal RNA, Small Nucleolar RNA/genetics, Small Nucleolar RNA/metabolism
17.
Int J Mol Sci ; 25(7), 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38612932

ABSTRACT

In the case of a food poisoning outbreak, it is essential to understand the relationship between cooking workers and food poisoning. Many biological diagnostic methods have recently been developed to detect food poisoning pathogens. Among these diagnostic tools, this study presents PCR-based pulsed-field gel electrophoresis and nucleotide sequencing diagnostic analysis results for diagnosing food poisoning outbreaks associated with cooking employees in Chungcheongnam-do, Republic of Korea. Pulsed-field gel electrophoresis was useful in identifying the food poisoning outbreaks caused by Staphylococcus aureus and Enteropathogenic Escherichia coli. In the case of Norovirus, nucleotide sequencing was used to identify the relationship between cooking workers and the food poisoning outbreak. However, it is difficult to determine whether cooking employees directly caused the food poisoning outbreaks based on these molecular biological diagnostic results alone. A system is needed to integrate epidemiological and diagnostic information to identify a direct correlation between the food poisoning outbreak and cooking employees.


Subjects
Foodborne Diseases, Nucleotides, Humans, Pulsed-Field Gel Electrophoresis, Base Sequence, Cooking, Foodborne Diseases/diagnosis, Foodborne Diseases/epidemiology
18.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38622357

ABSTRACT

Pseudouridine is an RNA modification that is widely distributed in both prokaryotes and eukaryotes, and plays a critical role in numerous biological activities. Despite its importance, the precise identification of pseudouridine sites through experimental approaches poses significant challenges, requiring substantial time and resources. Therefore, there is a growing need for computational techniques that can reliably and quickly identify pseudouridine sites from vast amounts of RNA sequencing data. In this study, we propose fuzzy kernel evidence Random Forest (FKeERF) to identify pseudouridine sites. This method is called PseU-FKeERF, which demonstrates high accuracy in identifying pseudouridine sites from RNA sequencing data. The PseU-FKeERF model selected four RNA feature coding schemes with relatively good performance for feature combination, and then input them into the newly proposed FKeERF method for category prediction. FKeERF not only uses fuzzy logic to expand the original feature space, but also combines kernel methods that are easy to interpret in general for category prediction. Both cross-validation tests and independent tests on benchmark datasets have shown that PseU-FKeERF has better predictive performance than several state-of-the-art methods. This new method not only improves the accuracy of pseudouridine site identification, but also provides a certain reference for disease control and related drug development in the future.


Subjects
Pseudouridine, Random Forest Algorithm, Pseudouridine/genetics, RNA/genetics, Base Sequence
19.
BMC Genomics ; 25(1): 365, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622536

ABSTRACT

BACKGROUND: Microbial genomes are largely comprised of protein coding sequences, yet some genomes contain many pseudogenes caused by frameshifts or internal stop codons. These pseudogenes are believed to result from gene degradation during evolution but could also be technical artifacts of genome sequencing or assembly. RESULTS: Using a combination of observational and experimental data, we show that many putative pseudogenes are attributable to errors that are incorporated into genomes during assembly. Within 126,564 publicly available genomes, we observed that nearly identical genomes often substantially differed in pseudogene counts. Causal inference implicated assembler, sequencing platform, and coverage as likely causative factors. Reassembly of genomes from raw reads confirmed that each variable affects the number of putative pseudogenes in an assembly. Furthermore, simulated sequencing reads corroborated our observations that the quality and quantity of raw data can significantly impact the number of pseudogenes in an assembler-dependent fashion. The number of unexpected pseudogenes due to internal stops was highly correlated (R² = 0.96) with average nucleotide identity to the ground truth genome, implying relative pseudogene counts can be used as a proxy for overall assembly correctness. Applying our method to assemblies in RefSeq resulted in rejection of 3.6% of assemblies due to significantly elevated pseudogene counts. Reassembly from real reads obtained from high coverage genomes showed considerable variability in spurious pseudogenes beyond that observed with simulated reads, reinforcing the finding that high coverage is necessary to mitigate assembly errors. CONCLUSIONS: Collectively, these results demonstrate that many pseudogenes in microbial genome assemblies are actually genes. Our results suggest that high read coverage is required for correct assembly and indicate an inflated number of pseudogenes due to internal stops is indicative of poor overall assembly quality.
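
The notion of a putative pseudogene caused by an internal stop codon can be made concrete with a small sketch. It is not the authors' pipeline: the helper `internal_stop_positions` is a hypothetical name, and the code assumes the three standard stop codons (TAA, TAG, TGA), which does not hold for organisms with alternative genetic codes.

```python
# Assumption: standard stop codons; some lineages reassign TGA or others.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stop_positions(cds):
    """Return codon indices (0-based) of stop codons that occur before the
    final codon of a coding sequence read in frame from its first base."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # The last codon is expected to be the terminal stop, so it is skipped.
    return [idx for idx, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Toy sequences: the first has a premature TAG at codon index 2.
broken = internal_stop_positions("ATGAAATAGGGGTAA")
intact = internal_stop_positions("ATGAAAGGGTAA")
```

Counting genes with a non-empty result across an assembly gives a rough tally of internal-stop candidates; whether each one reflects real gene decay or an assembly error is exactly the question the abstract raises.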


Subjects
Bacterial Genome, Pseudogenes, Pseudogenes/genetics, Chromosome Mapping, Base Sequence, Microbial Genome, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods
20.
Commun Biol ; 7(1): 447, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605212

ABSTRACT

Protein evolution is constrained by structure and function, creating patterns in residue conservation that are routinely exploited to predict structure and other features. Similar constraints should affect variation across individuals, but it is only with the growth of human population sequencing that this has been tested at scale. Now, human population constraint has established applications in pathogenicity prediction, but it has not yet been explored for structural inference. Here, we map 2.4 million population variants to 5885 protein families and quantify residue-level constraint with a new Missense Enrichment Score (MES). Analysis of 61,214 structures from the PDB spanning 3661 families shows that missense depleted sites are enriched in buried residues or those involved in small-molecule or protein binding. MES is complementary to evolutionary conservation and a combined analysis allows a new classification of residues according to a conservation plane. This approach finds functional residues that are evolutionarily diverse, which can be related to specificity, as well as family-wide conserved sites that are critical for folding or function. We also find a possible contrast between lethal and non-lethal pathogenic sites, and a surprising clinical variant hot spot at a subset of missense enriched positions.


Subjects
Proteins, Humans, Protein Domains, Proteins/metabolism, Protein Binding, Base Sequence